ARCH Models (Volatility Models)#
Goals#
Precisely define conditional heteroskedasticity.
Explain why we model variance (volatility) instead of the mean for many financial return series.
Build ARCH(q) variance equations in LaTeX and implement them from scratch in NumPy.
Visualize volatility clustering and variance forecasts with Plotly.
Conditional heteroskedasticity (definition)#
Let \(r_t\) be a return (or mean-zero residual) and \(\mathcal{F}_{t-1}\) the information available up to time \(t-1\).
A standard decomposition is:

\[
r_t = \mu_t + \varepsilon_t, \qquad
\varepsilon_t = z_t \sqrt{h_t}, \qquad
z_t \overset{\text{iid}}{\sim} (0, 1), \qquad
h_t = \operatorname{Var}(r_t \mid \mathcal{F}_{t-1}).
\]
Homoskedastic models assume \(h_t = \sigma^2\) is constant.
Conditionally heteroskedastic models assume \(h_t\) is time-varying and depends on past information \(\mathcal{F}_{t-1}\).
ARCH/GARCH models explicitly specify a recursion for \(h_t\) (the conditional variance).
Why model variance (volatility) instead of the mean?#
For many liquid financial assets at daily (or higher) frequency:
The conditional mean \(\mathbb{E}[r_t\mid\mathcal{F}_{t-1}]\) is often small and hard to predict reliably.
The conditional variance \(h_t\) shows strong, stable structure (persistence / clustering), making it forecastable.
Many decisions depend more on risk than drift: even if the expected return is near 0, the uncertainty is not.
Financial use cases#
Risk management: Value-at-Risk (VaR), Expected Shortfall, stress testing, margin.
Portfolio construction: volatility targeting, risk parity, dynamic leverage.
Derivatives: volatility inputs/forecasts (and a bridge to option-implied volatility).
Trading & execution: position sizing, stop placement, liquidity/impact models.
Volatility clustering + market-shock intuition#
Volatility clustering: large moves tend to be followed by large moves (of either sign), and small moves tend to be followed by small moves.
A simple shock story:
Suppose a big negative news event hits at time \(t\).
The return shock \(\varepsilon_t\) is large in magnitude, so \(\varepsilon_t^2\) is large.
In ARCH models, the next conditional variance \(h_{t+1}\) increases because it loads on past squared shocks.
With higher \(h_{t+1}\), the next innovation \(\varepsilon_{t+1}\) is more likely to be large in magnitude, so clusters form.
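The shock story above can be made concrete with a toy ARCH(1) calculation (the parameter values and shock sizes here are illustrative, not estimated from data):

```python
import numpy as np

# Toy ARCH(1): h_{t+1} = omega + alpha * eps_t^2
# Illustrative parameters, not estimated from any data.
omega, alpha = 0.05, 0.4
h_bar = omega / (1 - alpha)  # long-run variance

calm_shock = 0.1   # small |eps_t| on a quiet day
news_shock = 1.5   # large |eps_t| after a news event

h_calm = omega + alpha * calm_shock**2
h_news = omega + alpha * news_shock**2

print(f"long-run variance    : {h_bar:.4f}")
print(f"h_t+1 after calm day : {h_calm:.4f}")
print(f"h_t+1 after big shock: {h_news:.4f}")
```

The post-shock variance is roughly an order of magnitude above the long-run level, so the next innovation is drawn from a much wider distribution, and large moves tend to follow large moves.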
ARCH(q) model (LaTeX)#
ARCH models specify the conditional variance as a weighted sum of recent squared innovations:

\[
\varepsilon_t = z_t \sqrt{h_t}, \qquad
h_t = \omega + \sum_{i=1}^{q} \alpha_i \, \varepsilon_{t-i}^2 .
\]
Parameter constraints (to keep \(h_t\) positive)#
\(\omega > 0\)
\(\alpha_i \ge 0\) for all \(i\)
Stationarity / finite unconditional variance (common condition)#
A typical second-order stationarity condition is:

\[
\sum_{i=1}^{q} \alpha_i < 1,
\]

which implies the long-run (unconditional) variance:

\[
\bar{h} = \frac{\omega}{1 - \sum_{i=1}^{q} \alpha_i}.
\]
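As a quick worked example, plugging in the parameter values used in the simulation later in this notebook (\(\omega = 0.05\), \(\alpha = (0.25, 0.15)\)):

```python
import numpy as np

omega = 0.05
alpha = np.array([0.25, 0.15])  # sum(alpha) = 0.40 < 1, so the condition holds
h_bar = omega / (1.0 - alpha.sum())
print(h_bar)  # 0.05 / 0.6 ≈ 0.0833
```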
Variance equation breakdown (what each term does)#
\(\omega\) sets the baseline scale (and pins down the long-run variance when combined with \(\alpha\)’s).
\(\alpha_i\) controls how strongly a past shock affects current volatility.
Squaring removes the sign: both positive and negative shocks increase variance.
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import os
import plotly.io as pio
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
pio.templates.default = "plotly_white"
np.set_printoptions(precision=4, suppress=True)
def simulate_arch(T: int, omega: float, alpha, mu: float = 0.0, burn: int = 300, seed: int = 7):
"""Simulate an ARCH(q) process with Gaussian innovations.
Model:
r_t = mu + eps_t
eps_t = z_t * sqrt(h_t)
h_t = omega + sum_{i=1..q} alpha_i * eps_{t-i}^2
"""
alpha = np.asarray(alpha, dtype=float)
q = int(alpha.size)
if q < 1:
raise ValueError("alpha must have at least one element (q>=1)")
if omega <= 0:
raise ValueError("omega must be > 0")
if np.any(alpha < 0):
raise ValueError("all alpha_i must be >= 0")
alpha_sum = float(alpha.sum())
if alpha_sum >= 1:
raise ValueError("Need sum(alpha) < 1 for finite unconditional variance in this demo")
h_bar = omega / (1.0 - alpha_sum)
n = int(T + burn)
rng = np.random.default_rng(seed)
z = rng.standard_normal(n)
eps = np.zeros(n)
h = np.full(n, h_bar)
eps[:q] = np.sqrt(h_bar) * z[:q]
for t in range(q, n):
# h_t = omega + sum_i alpha_i * eps_{t-i}^2
lagged_eps_sq = eps[t - np.arange(1, q + 1)] ** 2
h[t] = omega + float(np.dot(alpha, lagged_eps_sq))
eps[t] = np.sqrt(h[t]) * z[t]
eps = eps[burn:]
h = h[burn:]
r = mu + eps
return r, eps, h
def arch_conditional_variance(eps, omega: float, alpha, initial_variance: float | None = None):
"""Compute h_t for an ARCH(q) model given a residual series eps_t."""
eps = np.asarray(eps, dtype=float)
alpha = np.asarray(alpha, dtype=float)
q = int(alpha.size)
if q < 1:
raise ValueError("alpha must have at least one element (q>=1)")
if omega <= 0:
raise ValueError("omega must be > 0")
if np.any(alpha < 0):
raise ValueError("all alpha_i must be >= 0")
if initial_variance is None:
alpha_sum = float(alpha.sum())
if alpha_sum >= 1:
raise ValueError("Need sum(alpha) < 1 to use the unconditional variance as initialization")
initial_variance = omega / (1.0 - alpha_sum)
h = np.full(eps.size, float(initial_variance))
for t in range(q, eps.size):
lagged_eps_sq = eps[t - np.arange(1, q + 1)] ** 2
h[t] = omega + float(np.dot(alpha, lagged_eps_sq))
return h
def arch_forecast_variance(eps, omega: float, alpha, horizon: int):
"""Multi-step variance forecasts for ARCH(q).
Uses E[eps_{t+k}^2 | F_t] = h_{t+k|t} (since Var(z)=1).
"""
eps = np.asarray(eps, dtype=float)
alpha = np.asarray(alpha, dtype=float)
q = int(alpha.size)
if eps.size < q:
raise ValueError("Need at least q residuals to forecast")
eps_sq_lags = eps[-np.arange(1, q + 1)] ** 2 # [eps_t^2, eps_{t-1}^2, ...]
forecasts = np.zeros(int(horizon), dtype=float)
for k in range(int(horizon)):
h_next = omega + float(np.dot(alpha, eps_sq_lags))
forecasts[k] = h_next
eps_sq_lags = np.concatenate([[h_next], eps_sq_lags[:-1]])
return forecasts
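A standalone sanity check of the forecast recursion (reimplemented inline with illustrative parameters and assumed recent shocks, so it does not depend on the cells above): multi-step ARCH forecasts should decay from an elevated first step toward the long-run variance \(\bar h\).

```python
import numpy as np

# Mirror of the forecast recursion: unknown future squared shocks are
# replaced by their conditional expectation, E[eps_{t+k}^2 | F_t] = h_{t+k|t}.
omega, alpha = 0.05, np.array([0.25, 0.15])
eps_sq_lags = np.array([2.0, 1.5])  # assumed recent squared shocks, newest first

h_bar = omega / (1.0 - alpha.sum())
fore = []
for _ in range(200):
    h_next = omega + float(alpha @ eps_sq_lags)
    fore.append(h_next)
    eps_sq_lags = np.concatenate([[h_next], eps_sq_lags[:-1]])

# First step is elevated by the recent shocks; long horizon ≈ h_bar
print(f"h(+1)={fore[0]:.4f}  h(+200)={fore[-1]:.4f}  h_bar={h_bar:.4f}")
```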
# Example: simulate an ARCH(2) return series to visualize volatility clustering
T = 2000
mu = 0.0
omega = 0.05
alpha = np.array([0.25, 0.15]) # sum < 1
r, eps, h = simulate_arch(T=T, omega=omega, alpha=alpha, mu=mu, seed=42)
view = 600
df = pd.DataFrame(
{
"t": np.arange(T),
"return": r,
"eps_sq": eps**2,
"h": h,
"sigma": np.sqrt(h),
}
).tail(view)
fig = make_subplots(
rows=2,
cols=1,
shared_xaxes=True,
vertical_spacing=0.06,
subplot_titles=("Simulated returns", "Volatility clustering: squared returns vs conditional variance"),
)
fig.add_trace(go.Scatter(x=df["t"], y=df["return"], mode="lines", name="r_t"), row=1, col=1)
fig.add_trace(
go.Scatter(x=df["t"], y=df["eps_sq"], mode="lines", name=r"$\varepsilon_t^2$", line=dict(width=1)),
row=2,
col=1,
)
fig.add_trace(
go.Scatter(x=df["t"], y=df["h"], mode="lines", name=r"$h_t$", line=dict(width=2)),
row=2,
col=1,
)
fig.update_yaxes(title_text="return", row=1, col=1)
fig.update_yaxes(title_text="variance", row=2, col=1)
fig.update_xaxes(title_text="time", row=2, col=1)
fig.update_layout(height=650, legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="left", x=0))
fig.show()
# Variance forecasts: h_{T+k | T}
horizon = 60
h_fore = arch_forecast_variance(eps, omega=omega, alpha=alpha, horizon=horizon)
alpha_sum = float(alpha.sum())
h_bar = omega / (1.0 - alpha_sum)
lookback = 250
hist_x = np.arange(T - lookback, T)
fore_x = np.arange(T, T + horizon)
fig = go.Figure()
fig.add_trace(go.Scatter(x=hist_x, y=h[-lookback:], mode="lines", name=r"historical $h_t$"))
fig.add_trace(go.Scatter(x=fore_x, y=h_fore, mode="lines", name=r"forecast $h_{T+k|T}$"))
fig.add_trace(
go.Scatter(
x=np.concatenate([hist_x, fore_x]),
y=np.full(hist_x.size + fore_x.size, h_bar),
mode="lines",
name=r"long-run $\bar h$",
line=dict(dash="dash"),
)
)
fig.update_layout(
title="ARCH variance forecast (mean reversion toward long-run variance)",
xaxis_title="time",
yaxis_title="variance",
height=450,
)
fig.show()
Notes / pitfalls#
ARCH models react to recent shocks but can require a large \(q\) to capture the long persistence seen in markets.
A frequent extension is GARCH, which adds lagged variance terms to get slow decay with few parameters.
The condition above targets a finite unconditional variance; strict stationarity holds under weaker and more nuanced conditions.
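As a sketch of the GARCH extension mentioned above, here is a minimal GARCH(1,1) conditional-variance recursion with illustrative parameter values (this function and its parameters are not part of the ARCH implementation in this notebook):

```python
import numpy as np

def garch11_variance(eps, omega, alpha, beta, h0=None):
    """Conditional variance for GARCH(1,1):
    h_t = omega + alpha * eps_{t-1}^2 + beta * h_{t-1}.
    The beta * h_{t-1} term carries over yesterday's variance,
    giving slow, persistent decay with a single extra parameter.
    """
    eps = np.asarray(eps, dtype=float)
    if h0 is None:
        h0 = omega / (1.0 - alpha - beta)  # long-run variance (needs alpha + beta < 1)
    h = np.empty(eps.size)
    h[0] = h0
    for t in range(1, eps.size):
        h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
    return h

# Illustrative values: alpha + beta = 0.97 gives high persistence
h = garch11_variance(np.array([0.0, 2.0, 0.0, 0.0, 0.0]),
                     omega=0.02, alpha=0.07, beta=0.90)
print(h.round(4))
```

Note how a single large shock at \(t = 1\) lifts the variance and then decays gradually, rather than dropping back after \(q\) steps as in a low-order ARCH.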